Discovering Distributional Thesauri Semantic Relations
Abstract
The paper presents a technique and analysis for discovering distributional thesaurus relations based on the statistical similarity of the contexts in which different words occur. The application uses an educational electronic text corpus and the statistical search functions of the Sketch Engine software to extract and compare word collocations from that corpus. The semantic search is based on evaluating and comparing the collocations that keywords share, by generating distributional thesaurus entries of semantically related words and word sketch differences. The results of search experiments on the British Academic Spoken English corpus are evaluated and presented.
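The idea of comparing words by the collocations they share can be illustrated with a minimal Python sketch. It is not the Sketch Engine implementation or its API: Sketch Engine works with collocations grouped by grammatical relation and scored with association measures, whereas this sketch assumes a simple token window, raw co-occurrence counts, and cosine similarity. The toy corpus and the function names `context_vectors`, `thesaurus_entry`, and `sketch_difference` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def thesaurus_entry(word, vectors, top_n=3):
    """Rank the other words by how similar their contexts are to those of `word`."""
    target = vectors.get(word, Counter())
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


def sketch_difference(w1, w2, vectors):
    """Split the collocates of two words into shared and word-specific sets."""
    a = set(vectors.get(w1, ()))
    b = set(vectors.get(w2, ()))
    return {"shared": sorted(a & b), w1: sorted(a - b), w2: sorted(b - a)}


# Hypothetical toy corpus; a real experiment would use a large corpus such as BASE.
corpus = [
    "the lecture covers the theory".split(),
    "the seminar covers the theory".split(),
    "students attend the lecture".split(),
    "students attend the seminar".split(),
    "students read the book".split(),
]

vectors = context_vectors(corpus)
print(thesaurus_entry("lecture", vectors))
print(sketch_difference("lecture", "book", vectors))
```

Words that occur in the same contexts receive a high similarity score and appear at the top of each other's thesaurus entry, while the difference function separates the collocates two words share from those specific to each word.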
Similar resources
Unsupervised selection of semantic relations for improving a distributional thesaurus (Sélection non supervisée de relations sémantiques pour améliorer un thésaurus distributionnel) [in French]
Work about distributional thesauri has shown that the relations in these thesauri are mainly reliable for high frequency words. In this article, we propose a method for improving such a thesaurus through its re-balancing in favor of low frequency words. This method is based on a bootstrapping mechanism: a set...
Automatic thesaurus construction
In this paper we introduce a novel method of automating thesauri using syntactically constrained distributional similarity. With respect to syntactically conditioned cooccurrences, most popular approaches to automatic thesaurus construction simply ignore the salience of grammatical relations and effectively merge them into one united ‘context’. We distinguish semantic differences of each syntac...
B2SG: a TOEFL-like Task for Portuguese
Resources such as WordNet are useful for NLP applications, but their manual construction consumes time and personnel, and frequently results in low coverage. One alternative is the automatic construction of large resources from corpora like distributional thesauri, containing semantically associated words. However, as they may contain noise, there is a strong need for automatic ways of evaluati...
Using Grammatical Relations to Automate Thesaurus Construction
In this paper we introduce a novel method of automating thesauri using syntactically constrained distributional similarity. With respect to syntactically conditioned co-occurrences, most popular approaches to automatic thesaurus construction simply ignore the salience of grammatical relations and effectively merge them into one united ‘context’. We distinguish semantic differences of each synta...
Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
Distributional thesauri are now widely used in a large number of Natural Language Processing tasks. However, they are far from containing only interesting semantic relations. As a consequence, improving such thesauri is an important issue that is mainly tackled indirectly through the improvement of semantic similarity measures. In this article, we propose a more direct approach focusing on the...